Categorical data fusion using auxiliary information
Similar resources
On Proxy Variables and Categorical Data Fusion
The problem of inference about the joint distribution of two categorical variables based on knowledge or observations of their marginal distributions, to be referred to as categorical data fusion in this paper, is relevant in statistical matching, ecological inference, market research, and several other related fields. This article organizes the use of proxy variables, to be distinguished from ...
Valuing Indirect Citations in Citation Networks using Data Fusion
Any scientific activity requires awareness of previous related activities. Citation networks are networks in which each document is treated as a link in a chain with its preceding and following documents, and the documents with the highest number of citations are considered the most influential in a domain. Most of the introduced methods use direct citations for valuing the documents. One ...
Automatic Image Annotation Using Auxiliary Text Information
The availability of databases of images labeled with keywords is necessary for developing and evaluating image annotation models. Dataset collection is, however, a costly and time-consuming task. In this paper we exploit the vast resource of images available on the web. We create a database of pictures that are naturally embedded into news articles and propose to use their captions as a proxy for...
Estimating Document Similarity using Auxiliary Category Information
We have developed a novel approach to determine the similarity of documents using probabilistic latent semantic indexing. For each document a probability vector of latent factors is estimated which on the one hand takes into account the distribution of words in the text and on the other hand the distribution of category values. The emphasis can be freely shifted between both aspects and therefo...
Using Auxiliary Information in Statistical Function Estimation
In many practical situations sample sizes are not sufficiently large and estimators based on such samples may not be satisfactory in terms of their variances. At the same time it is not unusual that some auxiliary information about the parameters of interest is available. This paper considers a method of using auxiliary information for improving properties of the estimators based on a current s...
Journal
Journal title: The Annals of Applied Statistics
Year: 2016
ISSN: 1932-6157
DOI: 10.1214/16-aoas925